Examples

Release Highlights

These examples illustrate the main features of each scikit-learn release.

Release Highlights for scikit-learn 0.23

Release Highlights for scikit-learn 0.24

Release Highlights for scikit-learn 0.22

Biclustering

Examples concerning biclustering with the sklearn.cluster module.

A demo of the Spectral Co-Clustering algorithm

A demo of the Spectral Biclustering algorithm

Biclustering documents with the Spectral Co-clustering algorithm

Calibration

Examples illustrating the calibration of predicted probabilities of classifiers.

Comparison of Calibration of Classifiers

Probability Calibration curves

Probability calibration of classifiers

Probability Calibration for 3-class classification

Classification

General examples about classification algorithms.

Recognizing hand-written digits

Normal, Ledoit-Wolf and OAS Linear Discriminant Analysis for classification

Plot classification probability

Classifier comparison

Linear and Quadratic Discriminant Analysis with covariance ellipsoid

Clustering

Examples concerning the sklearn.cluster module.

An example of K-Means++ initialization

Plot Hierarchical Clustering Dendrogram

Feature agglomeration

A demo of the mean-shift clustering algorithm

Demonstration of k-means assumptions

Online learning of a dictionary of parts of faces

Vector Quantization Example

Demo of affinity propagation clustering algorithm

Agglomerative clustering with and without structure

Various Agglomerative Clustering on a 2D embedding of digits

Segmenting the picture of Greek coins in regions

K-means Clustering

Spectral clustering for image segmentation

A demo of structured Ward hierarchical clustering on an image of coins

Demo of DBSCAN clustering algorithm

Color Quantization using K-Means

Hierarchical clustering: structured vs unstructured Ward

Agglomerative clustering with different metrics

Inductive Clustering

Demo of OPTICS clustering algorithm

Compare BIRCH and MiniBatchKMeans

Empirical evaluation of the impact of k-means initialization

Adjustment for chance in clustering performance evaluation

Comparison of the K-Means and MiniBatchKMeans clustering algorithms

Feature agglomeration vs. univariate selection

Comparing different hierarchical linkage methods on toy datasets

A demo of K-Means clustering on the handwritten digits data

Selecting the number of clusters with silhouette analysis on KMeans clustering

Comparing different clustering algorithms on toy datasets

Covariance estimation

Examples concerning the sklearn.covariance module.

Ledoit-Wolf vs OAS estimation

Sparse inverse covariance estimation

Shrinkage covariance estimation: LedoitWolf vs OAS and max-likelihood

Robust covariance estimation and Mahalanobis distances relevance

Robust vs Empirical covariance estimate

Cross decomposition

Examples concerning the sklearn.cross_decomposition module.

Principal Component Regression vs Partial Least Squares Regression

Compare cross decomposition methods

Dataset examples

Examples concerning the sklearn.datasets module.

The Digit Dataset

The Iris Dataset

Plot randomly generated classification dataset

Plot randomly generated multilabel dataset

Decision Trees

Examples concerning the sklearn.tree module.

Decision Tree Regression

Multi-output Decision Tree Regression

Plot the decision surface of a decision tree on the iris dataset

Post pruning decision trees with cost complexity pruning

Understanding the decision tree structure

Decomposition

Examples concerning the sklearn.decomposition module.

Beta-divergence loss functions

PCA example with Iris dataset

Incremental PCA

Comparison of LDA and PCA 2D projection of Iris dataset

Factor Analysis (with rotation) to visualize patterns

Blind source separation using FastICA

Principal components analysis (PCA)

FastICA on 2D point clouds

Kernel PCA

Model selection with Probabilistic PCA and Factor Analysis (FA)

Sparse coding with a precomputed dictionary

Image denoising using dictionary learning

Faces dataset decompositions

Ensemble methods

Examples concerning the sklearn.ensemble module.

Pixel importances with a parallel forest of trees

Decision Tree Regression with AdaBoost

Plot individual and voting regression predictions

Feature importances with forests of trees

IsolationForest example

Monotonic Constraints

Plot the decision boundaries of a VotingClassifier

Comparing random forests and the multi-output meta estimator

Prediction Intervals for Gradient Boosting Regression

Gradient Boosting regularization

Plot class probabilities calculated by the VotingClassifier

Gradient Boosting regression

OOB Errors for Random Forests

Two-class AdaBoost

Hashing feature transformation using Totally Random Trees

Multi-class AdaBoosted Decision Trees

Discrete versus Real AdaBoost

Early stopping of Gradient Boosting

Feature transformations with ensembles of trees

Gradient Boosting Out-of-Bag estimates

Single estimator versus bagging: bias-variance decomposition

Categorical Feature Support in Gradient Boosting

Plot the decision surfaces of ensembles of trees on the iris dataset

Combine predictors using stacking

Examples based on real world datasets

Applications to real-world problems with medium-sized datasets or an interactive user interface.

Outlier detection on a real data set

Compressive sensing: tomography reconstruction with L1 prior (Lasso)

Topic extraction with Non-negative Matrix Factorization and Latent Dirichlet Allocation

Faces recognition example using eigenfaces and SVMs

Model Complexity Influence

Visualizing the stock market structure

Wikipedia principal eigenvector

Species distribution modeling

Libsvm GUI

Prediction Latency

Out-of-core classification of text documents

Feature Selection

Examples concerning the sklearn.feature_selection module.

Recursive feature elimination

Comparison of F-test and mutual information

Pipeline Anova SVM

Recursive feature elimination with cross-validation

Model-based and sequential feature selection

Test with permutations the significance of a classification score

Univariate Feature Selection

Gaussian Mixture Models

Examples concerning the sklearn.mixture module.

Density Estimation for a Gaussian mixture

Gaussian Mixture Model Ellipsoids

Gaussian Mixture Model Selection

GMM covariances

Gaussian Mixture Model Sine Curve

Concentration Prior Type Analysis of Variational Bayesian Gaussian Mixture

Gaussian Process for Machine Learning

Examples concerning the sklearn.gaussian_process module.

Illustration of Gaussian process classification (GPC) on the XOR dataset

Gaussian process classification (GPC) on iris dataset

Comparison of kernel ridge and Gaussian process regression

Illustration of prior and posterior Gaussian process for different kernels

Iso-probability lines for Gaussian Processes classification (GPC)

Probabilistic predictions with Gaussian process classification (GPC)

Gaussian process regression (GPR) with noise-level estimation

Gaussian Processes regression: basic introductory example

Gaussian process regression (GPR) on Mauna Loa CO2 data

Gaussian processes on discrete data structures

Generalized Linear Models

Examples concerning the sklearn.linear_model module.

Lasso path using LARS

Plot Ridge coefficients as a function of the regularization

SGD: Maximum margin separating hyperplane

SGD: convex loss functions

Ordinary Least Squares and Ridge Regression Variance

Plot Ridge coefficients as a function of the L2 regularization

SGD: Penalties

Logistic function

Polynomial interpolation

Regularization path of L1-Logistic Regression

Logistic Regression 3-class Classifier

SGD: Weighted samples

Non-negative least squares

Linear Regression Example

Robust linear model estimation using RANSAC

Sparsity Example: Fitting only features 1 and 2

HuberRegressor vs Ridge on dataset with strong outliers

Lasso on dense and sparse data

Comparing various online solvers

Joint feature selection with multi-task Lasso

MNIST classification using multinomial logistic + L1

Plot multi-class SGD on the iris dataset

Orthogonal Matching Pursuit

Lasso and Elastic Net for Sparse Signals

Curve Fitting with Bayesian Ridge Regression

Theil-Sen Regression

Plot multinomial and One-vs-Rest Logistic Regression

Robust linear estimator fitting

L1 Penalty and Sparsity in Logistic Regression

Lasso and Elastic Net

Automatic Relevance Determination Regression (ARD)

Bayesian Ridge Regression

Lasso model selection: Cross-Validation / AIC / BIC

Multiclass sparse logistic regression on 20newsgroups

Early stopping of Stochastic Gradient Descent

Poisson regression and non-normal loss

Tweedie regression on insurance claims

Inspection

Examples related to the sklearn.inspection module.

Permutation Importance with Multicollinear or Correlated Features

Permutation Importance vs Random Forest Feature Importance (MDI)

Partial Dependence and Individual Conditional Expectation Plots

Common pitfalls in interpretation of coefficients of linear models

Kernel Approximation

Examples concerning the sklearn.kernel_approximation module.

Scalable learning with polynomial kernel approximation

Manifold learning

Examples concerning the sklearn.manifold module.

Swiss Roll reduction with LLE

Comparison of Manifold Learning methods

Multi-dimensional scaling

t-SNE: The effect of various perplexity values on the shape

Manifold Learning methods on a severed sphere

Manifold learning on handwritten digits: Locally Linear Embedding, Isomap...

Miscellaneous

Miscellaneous and introductory examples for scikit-learn.

Compact estimator representations

ROC Curve with Visualization API

Visualizations with Display Objects

Isotonic Regression

Advanced Plotting With Partial Dependence

Face completion with multi-output estimators

Multilabel classification

Comparing anomaly detection algorithms for outlier detection on toy datasets

The Johnson-Lindenstrauss bound for embedding with random projections

Comparison of kernel ridge regression and SVR

Explicit feature map approximation for RBF kernels

Missing Value Imputation

Examples concerning the sklearn.impute module.

Imputing missing values with variants of IterativeImputer

Imputing missing values before building an estimator

Model Selection

Examples related to the sklearn.model_selection module.

Plotting Cross-Validated Predictions

Confusion matrix

Plotting Validation Curves

Detection error tradeoff (DET) curve

Successive Halving Iterations

Underfitting vs. Overfitting

Parameter estimation using grid search with cross-validation

Comparing randomized search and grid search for hyperparameter estimation

Train error vs Test error

Receiver Operating Characteristic (ROC) with cross validation

Nested versus non-nested cross-validation

Demonstration of multi-metric evaluation on cross_val_score and GridSearchCV

Sample pipeline for text feature extraction and evaluation

Balance model complexity and cross-validated score

Comparison between grid search and successive halving

Visualizing cross-validation behavior in scikit-learn

Receiver Operating Characteristic (ROC)

Precision-Recall

Plotting Learning Curves

Statistical comparison of models using grid search

Multioutput methods

Examples concerning the sklearn.multioutput module.

Classifier Chain

Nearest Neighbors

Examples concerning the sklearn.neighbors module.

Nearest Neighbors regression

Outlier detection with Local Outlier Factor (LOF)

Nearest Centroid Classification

Kernel Density Estimation

Nearest Neighbors Classification

Caching nearest neighbors

Neighborhood Components Analysis Illustration

Novelty detection with Local Outlier Factor (LOF)

Comparing Nearest Neighbors with and without Neighborhood Components Analysis

Dimensionality Reduction with Neighborhood Components Analysis

Kernel Density Estimate of Species Distributions

Simple 1D Kernel Density Estimation

Approximate nearest neighbors in TSNE

Neural Networks

Examples concerning the sklearn.neural_network module.

Visualization of MLP weights on MNIST

Restricted Boltzmann Machine features for digit classification

Compare Stochastic learning strategies for MLPClassifier

Varying regularization in Multi-layer Perceptron

Pipelines and composite estimators

Examples of how to compose transformers and pipelines from other estimators. See the User Guide.

Concatenating multiple feature extraction methods

Pipelining: chaining a PCA and a logistic regression

Selecting dimensionality reduction with Pipeline and GridSearchCV

Column Transformer with Mixed Types

Column Transformer with Heterogeneous Data Sources

Effect of transforming the targets in regression model

Preprocessing

Examples concerning the sklearn.preprocessing module.

Using KBinsDiscretizer to discretize continuous features

Demonstrating the different strategies of KBinsDiscretizer

Importance of Feature Scaling

Map data to a normal distribution

Feature discretization

Compare the effect of different scalers on data with outliers

Semi Supervised Classification

Examples concerning the sklearn.semi_supervised module.

Label Propagation learning a complex structure

Label Propagation digits: Demonstrating performance

Decision boundary of semi-supervised classifiers versus SVM on the Iris dataset

Effect of varying threshold for self-training

Semi-supervised Classification on a Text Dataset

Label Propagation digits active learning

Support Vector Machines

Examples concerning the sklearn.svm module.

Non-linear SVM

SVM: Maximum margin separating hyperplane

SVM with custom kernel

SVM Tie Breaking Example

SVM: Weighted samples

Plot the support vectors in LinearSVC

SVM: Separating hyperplane for unbalanced classes

SVM-Kernels

SVM-Anova: SVM with univariate feature selection

Support Vector Regression (SVR) using linear and non-linear kernels

SVM Margins Example

One-class SVM with non-linear kernel (RBF)

Plot different SVM classifiers in the iris dataset

Scaling the regularization parameter for SVCs

RBF SVM parameters

Tutorial exercises

Exercises for the tutorials

Digits Classification Exercise

Cross-validation on Digits Dataset Exercise

SVM Exercise

Cross-validation on diabetes Dataset Exercise

Working with text documents

Examples concerning the sklearn.feature_extraction.text module.

FeatureHasher and DictVectorizer Comparison

Clustering text documents using k-means

Classification of text documents using sparse features

© 2007–2020 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/0.24/auto_examples/index.html